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ABSTRACT 


The Ballistic Missile Defense Organization has been developing the capability 
to evaluate one or more high-rate sensor/hardware combinations by incorporating 
them as payloads on a series of Miniature Seeker Technology Insertion (MSTI) flights. 
This publication represents the final report of a 1 993 study to analyze the potential 
impact of data compression and of related communication system technologies on 
post-MSTI 3 flights. 

Lossless compression is considered alone and in conjunction with various 
spatial editing modes. Additionally, JPEG and Fractal algorithms are examined in 
order to bound the potential gains from the use of lossy compression. But lossless 
compression is clearly shown to better fit the goals of the MSTI investigations. 
Lossless compression factors of between 2:1 and 6:1 would provide significant 
benefits to both on-board mass memory and the downlink. For on-board mass 
memory, the savings could range from $5 million to $9 million. Such benefits should 
be possible by direct application of recently developed NASA VLSI microcircuits. 

It is shown that further downlink enhancements of 2:1 to 3:1 should be feasible 
through practical modifications to the existing modulation system and incorporation of 
Reed-Solomon channel coding. The latter enhancement could also be achieved by 
applying recently developed VLSI microcircuits. 
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I. BACKGROUND/PROBLEM DEFINITION 


The Ballistic Missile Defense Organization (BMDO) has been developing the 
capability to evaluate one or more high-rate sensor/hardware combinations by 
incorporating them as payloads on a series of Miniature Seeker Technology Insertion 
(MSTI) flights. This publication represents the final report of a 1993 study to analyze 
the potential impact of data compression and of related communication system 
technologies on post-MSTI 3 flights. 

For the purposes of discussion here, it is assumed that the principal purpose of 
a MSTI flight is to provide the capability to thoroughly evaluate the characteristics of 
one or more image sensor/hardware combinations. This includes the ability to 
determine not only that a system worked, but if it didn't, why? The following 
discussions will make certain system baseline parameter assumptions and then 
identify how data compression can improve MSTI capability to achieve these 
evaluation goals and others as well. 

BASIC PROBLEM 

During the lifetime of a flight there can be expected to be several Sensor 
Events which will nominally last only 10 minutes each. But the data generated during 
a Sensor Event is at a much higher data rate than can be communicated in real-time. 
This is the fundamental problem! The data system must be such that for each Sensor 
Event: 

la) Sensor/hardware operation can be verified. 

lb) If operation is flawed, one can determine why. 

lc) A full visual demonstration of targeting is possible. 

l d) Sample real data from distinct scenarios is provided for 
• Future analysis. 
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• Verification/upgrade of "Kinetic Energy Weapons Hardware In the 
Loop Simulator" (KHILS) simulation capability; KHILS is located 
at Eglin Air Force Base. 

1e) Ecological monitoring. 

Baseline Parameters 

Figure 1 illustrates the key elements of the on-board data acquisition system 
and uses parameters obtained from meetings with Phillips Laboratory and Spectrum 
Astro in early 1993. 

Multiple sensors generate image data at 30 frames/second (f/s) for a Sensor 
Event's time period which we will henceforth assume to be 10 minutes. Nominal 
frame sizes are a baseline 256 picture elements x 256 lines at 8 bits/pixel (b/p), but 
frame sizes may be as high as 1024 x 1024 x 12. Use of the latter specification will 
correspond to worst case (maximum) conditions in future discussions. 

Each "event" is stored in mass memory for later transmission. A baseline sensor 
generates approximately 15 Mbits/s, which for a full Sensor Event interval (10 minutes) 
results in an accumulation of about 1 Gbyte of data or 18,000 frames. However, a 
single 1024 x 1024 x 12 sensor could generate 28 Gbytes of data for a Sensor Event, 
and do so at 377 Mbits/s. 

What Data is Needed? 

Figure 2 focuses on individual frames of image data. On-board tracking can be 
expected to maintain lock on "something." If this something is the desired "target," the 
sensor/hardware combination is working. If the tracker veers off from the real target, it 
is important to be able to discover why by analyzing the data in the many frames 
generated during the same time period (i.e., those the tracking system would have 
used when the system failed). 

The target itself might typically lie within a 16 x 16 area during proper operation 
(assuming the baseline 256 x 256 size image). 
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3-4 SENSORS 



Fig. 1. Key On-board Data System Elements. 
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Fig. 2. The Content of Frames 
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Meeting the Goals in Paragraphs (la) through (1e) 


2a) If we are seeking full verification that some sensor/hardware combination 
works, only the central area (containing the target) is needed. Probably 
close to full frame rate is desired for this central area. 

2b) If the system fails, then all the data in all the frames in the time period 
around the failure are needed (at full fidelity). 

Since when and if such an event might occur is unknown, this implies 
that no data should be discarded or degraded on its way into 
mass memory. 

2c) To visually demonstrate what a sensor "saw" (e.g., make a movie), a 
significantly lower frame rate at some possibly lower quality is probably 
acceptable. 

2d) To provide complete data sets, clearly all the data must be recovered 
without degradation. 

2e) To demonstrate sensor capability to support "ecological monitoring" 
suggests a requirement for gathering sets of full quality images for 
ground processing and study. No data in these sets would be discarded 
or degraded on their way into mass memory. However, these sets of 
images should be far smaller than those gathered for the primary MSTI 
mission. (This is because such images can be acquired at a relatively 
very low frame rate.) This requirement should thus have 
negligible impact on data storage and downlink requirements. 

These are all important observations and will directly impact the approaches to 
data compression which are treated later. 


4 



DATA SYSTEM ELEMENTS AND TERMS 


For our purposes at this point, Data Compression can be looked at as any 
mechanism for representing a set of data. (The sets of data we'll be interested in 
are various sub-sets of image data generated by a MSTI Sensor Event.) Of course, the 
goal is to reduce the number of bits required, compared to a Standard Representation 
(i.e., uncompressed). The factor by which the number of bits is reduced is usually 
called the Compression Factor (CF). If a Standard Representation is used, CF=1. 

When dealing with images, a normalized image rate in bits/pixel (b/p) is often 
given. This typically reflects the number of bits originally used to quantize each 
individual pixel (e.g., the 8 b/p assumed for a baseline sensor). A compression rate is 
then the average b/p required by a particular compression scheme. 

Rate/Quality. Generally, a compressed data set must be "decompressed" or 
reconstructed back into an approximation of the Standard Representation before 
it can be used. When the reconstruction produces an exact reproduction of the original 
Standard Representation, the method of compression is said to be "lossless" or 
"noiseless". Otherwise, when the reconstruction produces an imperfect reproduction of 
the original Standard Representation, the method of compression is called "lossy". 

Figure 3 illustrates the relationship between rate and quality (of reconstruction) 
relative to the uncompressed image data, when data compression is applied to 
individual image frames (intraframe compression). 

Lossless compression of 2:1 is typically achievable with intraframe 
compression. But the precise CF is data dependent and could be slightly less than 2:1 
or as much as 3:1 for MSTI-like data. Of course, the algorithms used can make a 
difference, too. In particular, using information from adjacent frames (interframe) could 
substantially increase the possible compression factor for lossless compression of 
some forms of MSTI data. We will call this kind of compression "Frame-to-Frame 
Lossless" or "F-to-F" lossless for short. In providing estimates of system impact, later 
discussions will assume an estimated average CF = 6:1 for F-to-F lossless 
compression. 
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Fig. 3. Rate/Quality Relationships. 

If errors are allowed in the reconstruction of some data set, the general rule is 
that higher compression factors are achievable (as compared to lossless). Again, the 
methods used can substantially impact the achievable CF (for a specified level of 
quality). These points are illustrated in Fig. 3. 

Rice Algorithms. This report will ultimately demonstrate major advantages of 
using JPL-developed adaptive lossless compression algorithms for MSTI needs. 
These have been demonstrated to be superior to any other lossless techniques for 
imaging and have recently been implemented as high-speed custom micro-circuits. 
Discussions of Baseline Intraframe lossless compression will assume the application 
of these chips. Where applicable, F-to-F variations in lossless coding can probably 
make use of the same chips, with some external hardware. Appendix A provides a 
discussion of the primary Rice Algorithms, current VLSI implementations and a list of 
references. Appendix A mentions a JPL chip being developed for the deep-space 
Cassini Project. This same chip was tested at KHILS for an application to the MSTI 3 
downlink. 
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Lossy Compression. For those situations where imperfect reproduction is 
acceptable for at least some of the data generated by a Sensor Event, we will 
investigate the consequences of several approaches. These will include simple 
spatial/temporal editing modes (which can also make use of lossless coding) as well 
as the commercial JPEG algorithms and powerful systems under development which 
use FRACTALS. 

END-TO-END BLOCK DIAGRAM 

Figure 4 shows how the major end-to-end data system elements and data 
compression fit together. 

As shown, one or more sensors generate high rate image data (see Fig. 2) of 
which a small subset can be transmitted to the ground in real-time. This small subset 
cannot satisfy all the goals in paragraphs (la)-(le), so further discussion will focus on 
the non-real-time acquisition and return of "required" data. See paragraphs (2a)-(2e). 



? 


Fig. 4. End-to-End Data System Elements 

The high rate image data is placed in a mass storage device and then later 
downlinked to the ground for processing, display, and analysis. As we'll see, it may be 
desirable to make this downlink process "interactive" because of the length of time 
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needed to transfer all "desired" data from a Sensor Event. In this case, received data is 
immediately viewed to determine the content of subsequent transmissions (see the 
line designated "Commands" in Fig. 4). 

Compressor Location On-Board 

The on-board data system in Fig. 4 shows two boxes labeled "compress". This 
is because data compression applies to both locations. For several reasons, not all of 
them technical, a MSTI 3 demonstration of lossless compression will place the 
compressor after mass memory. However, for post MSTI 3 missions it is not yet clear 
where it will be possible to locate it. 

As noted in paragraphs (2a), (2d) and (2e), goals (la), (Id) and (1 e) cannot be 
achieved unless all the data is placed into mass memory in a form which can 
ultimately be reconstructed on the ground without error. That is, if compression is 
used before storage in mass memory, it should be lossless compression. 

Ultimately, the advantage in applying compression before storage in mass 
memory is: 

a) A reduction in memory requirements, or 

b) An increase in memory capability for a given memory capacity. 

Both of these can translate into significant dollar savings. This potential is quantified in 
a later section. 

Note that the placement of the compressor after mass memory on the MSTI 3 
demonstration precludes achieving either of these advantages before MSTI 4. 

There are some subtle implementation consequences to compressing 
before storage in mass memory. Later, we will identify lossy compression modes for 
data received from mass memory. These include spatial/temporal formatting 
techniques (later identified as BGFGs), which make use of lossless coding, JPEG 
algorithms and FRACTALS. The JPEG and FRACTAL techniques can only be 
applied to uncompressed data. This would imply that data which had 
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been placed into mass memory compressed (Lossless), would need to be 
"decompressed" before these other techniques could be used. 

Implementing these spatial/temporal formatting techniques using only lossless 
coding would not necessarily require an on-board lossless decompressor, but 
could require careful attention to the manner in which compressed data was "data 
based" into memory. This in turn could influence the choice of "compression chip" 
used (i.e., it could be different than the schedule-mandated choice for MSTI 3). 

The Downlink 

The advantages of data compression to the downlink process are much like 
those in the situation for on-board mass memory. The link rate is inadequate, under 
almost any realistic assumptions. Data compression decreases the time needed to 
communicate through a data link of a specific capacity. But the potential CF for 
downlinking is higher than for on-board mass memory because some situations 
may not require lossless reconstruction of all the data. This subject is treated later. 

Information provided by Ray Rew of Spectrum Astro in March 1993 led to a 
proposal for a future practical upgrade of the uncoded 1 Mbit/s MSTI 3 downlink, at a 
bit error probability of 10 6 , to from 2 to 3 Mbits/s with far superior error characteristics. 
This proposal is repeated as Appendix B and includes discussion on the interaction of 
compressed data and channel errors. 

The Ground Data System 

The position of a decompressor relative to mass memory on the ground cannot 
yet be determined, as illustrated in Fig. 4. 

Thus far there has been virtually no discussion of ground concerns. Our 
assumption has been that decompression could always be done in non-real-time, 
using software methods. This will certainly be the case for initial compression 
experiments. Thus decompression would initially follow extraction from mass 
memory (although the decompressed data would probably still reside in some other 
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mass memory form, not shown in Fig. 4). But in the longer run, this area needs greater 
consideration. 

For example, to interactively respond to transmitted data, perhaps altering what 
is sent next, a decompressor would need to operate as fast as the downlink produces 
the data. Assuming a 1 Mbit/s downlink, data compressed 2:1 (lossless, intraframe - 
the baseline) would need to be decompressed into display data at a 2 Mbit/s rate. A 
software approach might be feasible. But if the compression factor were 50:1, 
decompressed data would generate 50 Mbits/s of display data. A purely software 
solution is unlikely. 

This is simply one of numerous ground considerations that ultimately influence 
the viability of various compression approaches discussed in later sections. Ground 
processing is part of the overall data system and needs to be considered in concert 
with on-board data system developments. 
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II. ON-BOARD MASS MEMORY IMPACT 


Table 1 illustrates the potential advantages to on-board mass memory capacity 
requirements based on a number of assumptions, specified below: 

1 ) Sensor Event Parameters: 

Frame Rate = 30 f/s 
Collection Interval = 10 minutes 

2) Baseline intraframe lossless compression achieves an average CF of 
2 : 1 . 

3) Frame-to-Frame (F-to-F) lossless compression achieves an average CF 
of 6:1. 

4) A baseline Sensor Event for one sensor (256 x 256 x 8) requires 1.17 
Gbytes of memory. 

5) Worst Case (maximum) assumption of an uncompressed Sensor Event 
for a single sensor (1024 x 1024 x 12) is 28 Gbytes of data. 

Then Table 1 shows, for 1 and 4 sensors, the total accumulation for a single 
Sensor Event, with and without lossless compression. 

Compression Value 

Consider assigning a value to the application of lossless compression by the 
following: 

Let 


0 = Total Cost of Memory if compression is not applied. (1 ) 
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Table 1. Mass Memory Data Accumulation 




ACCUMULATION TOTAL, GBYTES 


CF 

1 SENSOR 
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UJ </) 
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4.68 
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UJ </) 

</) z 

2:1 

0.58 

2.34 

< UJ 
CD </) 

6:1 

0.19 

0.78 

H 

i§ 

1:1 

28 

112 

< « 
2 

2:1 

14 

56 

6:1 

4.7 

18.7 


The Compression Value (CV) is then defined as the difference in the cost of 
memory, with and without compression: 


cv = 0 " §f ( 2 ) 

where 0/CF is the cost of memory needed when compression is used. 

Various estimates for the true cost of "flight" mass memory have been put forth. 
These have ranged between 


$100 K/Gbyte 


up to 


$1 million/Gbyte 


(3) 


(4) 


Figure 5 illustrates the cost/value (6 and CV) of Eq. (2) for both of these assumptions. 
Simply interpret the ordinate as either in $100K units or in $1 million units. 
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MEMORY REQUIRED WITHOUT COMPRESSION, GBYTES 
Fig. 5. Mass Memory Compression Value 


If we consider the baseline as four sensors with the minimum (256 x 256 x 8) 
format, 4.68 Gbytes are needed. Assuming a cost of $100K/Gbyte, the uncompressed 
memory requirement is $468K and the compression value is 468 - 468/2 = $234K 
assuming a 2:1 compression. But this is $2.34 million at the other end of the scale. 
Assuming large format sensors, requiring a worst case 112 Gbytes (for four 1024 x 
1024 x 12 sensors), the value of compression is a minimum of $5.6 million for 2:1 and 
$9.3 million for 6:1 (F-to-F), assuming the lowest cost of $100K/Gbyte. 

If the cost of memory were only double the minimum assumption of 
$100K/Gbyte, the potential value of compression would go to $11.2 million and $18.6 
million respectively for 2:1 and 6:1. This would suggest that performing even the 2:1 
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compression before loading in mass memory should have value far exceeding 
compression cost. 

Collection Interval 

Another way to look at the impact of compression and memory is to assess the 
limitations imposed on the allowed collection interval. Figure 6 shows the impact when 
the worst-case sensor complement of four 1024 x 1024 x 12 sensors is assumed. 
Without compression 112 Gbytes of mass memory are required to achieve the 
"desired" full 10 minute Sensor Event. Only 56 Gbytes are needed if 2:1 compression 
is to be achieved (baseline lossless compression) and only 18.6 Gbytes if a 6:1 F-to-F 
capability were achieved. But without compression, having "only" 19 Gbytes would 
limit collection intervals to less than 2 minutes instead of the desired 10. 

A similar graph appears in Fig. 7 where four 256 x 256 x 8 sensors are 
assumed. With only 1.17 Gbytes of mass memory, a system without compression could 
collect only Sensor Events which were 2.5 minutes in length. 

Again, the advantages of employing lossless compression before mass memory 
storage are quite obvious, particularly when large format sensors are anticipated. 


FULL SENSOR EVENT INTERVAL 



Fig. 6. Mass Memory Impact on Collection Interval, 
4 1024 x 1024 x 12 Sensors 
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COLLECTION INTERVAL, min 


FULL SENSOR EVENT INTERVAL 



Fig. 7. Mass Memory Impact on Collection Interval, 
4 256 x 256 x 8 Sensors 
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III. DOWNLINK CONSIDERATIONS 


ASSUMPTIONS 

Ray Rew of Spectrum Astro has specified the MSTI 3 downlink as an uncoded 
link with the following description: 

• Downlink Rate = 1 Mbit/s 

• Minimum Error Rate = 10' 6 

Based on this description of the downlink, a technique for upgrading this 
capability to up to 3 Mbits/s with virtually error-free communication has been proposed. 
Implementation should be straightforward and could provide another important 
technology demonstration for BMDO. This approach for incorporating Reed-Solomon 
encoding is described in Appendix B. 

Assuming there is still a possibility for these improvements to be implemented, 
we will assume 


1 < Rp < 3 Mbit/s (5) 

where Rd is the downlink data rate, and Rp = 1 Mbit/s is the baseline. Error 
considerations are also described in Appendix B. 

Duty Factor 

The situation is worse than Rp in Eq. (5) would imply because the link is only 
available for 10 minutes out of a 90 minute orbit. The net result is an “effective" 
average downlink rate of only 

Rp = Rp/9 (1 1 1 < R D < 333 Kbits/s) (6) 

Table 2 illustrates the extent of the disparity between the (effective) downlink 
rate, R'p, and the collection rate, which lists values for the fraction 
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Collection Rate 


( 7 ) 


r d 


Table 2. Collection Rate vs. Effective Downlink Rate 




COLLECTION RATE 



R P (EFFECTIVE DOWNLINK RATE) 

SENSOR TYPE 

Rq, Mbits/s 

1 SENSOR 

4 SENSORS 

256 X 256 X 8 

1 

141 

566 

3 

47 

189 

1024 X 1024 X 12 

1 

3397 

13,590 

3 

1132 

4530 


LOSSLESS TRANSMISSION TIMES 
Transmission Time 

We can begin looking at some of the trade-offs that will need to be considered. 
Figure 8 illustrates the "required” transmission times (i.e., using Rd in Eq. (5), not R D in 
Eq. (6)) to transmit selected subsets of all the frames of a 10 minute Sensor Event, 
assuming four 1024 x 1024 x 12 sensors. This corresponds to the, possibly 
conservative, worst case situation in Table 1(112 Gbytes). 

The ordinate in Fig. 8 shows the number of hours of link time to return all the 
data of a central, square 2 n x 2 n sub-image. The abscissa shows the size of one side 
of this central sub-image, 2 n . This axis can also be viewed as the size of a smaller 
format image to be sent (not a sub-image). 
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TRANSMISSION TIME*, HOURS 


« 3 MONTHS 



SQUARE SUB-IMAGE DIMENSION 


Fig. 8. Transmission Time (active link time) required for transmission of 
full 10-minute Sensor Events using 4 sensors and 1024 x 1024 x 12 
images or central, 2 n x 2 n sub-images. 
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Cases Considered 


The following six assumptions are made about the data compression/downlink 
capabilities: 

a) Data is transmitted uncompressed on a downlink with 1 Mbit/s (8) 

capability. The graph is indicated as (Rd = 1, CF = 1). 

b) The MSTI 3 experimental data compression of 2:1 is assumed (9) 

and the downlink remains at 1 Mbit/s (Rd = 1, CF = 2). 

c) Data is transmitted uncompressed, but the proposed improved (10) 

downlink capability using Reed-Solomon coding achieves a 

downlink data rate of 3 Mbits/s (Rd = 3, CF = 1). 

d) Assumptions (9) and (10) are combined (Rd =3, CF = 2) for a (11) 

system gain of 6:1. 

e) A 1 Mbit/s downlink is assumed but a 6:1 F-to-F or lossless (12) 

compression is achieved (Rd = 1 - CF = 6). This option has 

the same system gain as Assumption (11), but is probably more 
difficult, and does not apply to all sensors. 

f) Assumptions (10) and (12) are combined for an overall system (13) 

gain of 18:1 (Rd = 3, CF = 6). 

Worst Case. The worst case situation occurs when 2 n = 1024, corresponding 
to sending all of the 1024 x 1024 images. With the basic (R D = 1, CF = 1) system, 2489 
hours of downlink time are required; that is, around 3 months. For the most do-able 
high performance capability in Assumption (11), this is reduced to around 414 hours 
(-2 weeks). In the ideal case in Assumption (13), this might be reduced further to about 
6 days. 
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Reducing 2 n . Reduction in image size has a dramatic effect since total data 
accumulation is proportional to (2 n ) 2 . Reduction down to the current baseline image 
size of 256 x 256, reduces the total time to 155 hours. (Rq = 1 , CF = 1 , n = 256.) 

But applying the quite feasible improvements in Assumption (11) reduces this 
time further to 26 hours, or about 1 day (R D = 3, CF = 2, n = 256). 

Using these same improvements and sending only a central 64 x 64 image, 
would require only 1 .6 hours. 

Total Transmission Time (with duty factor) 

Unfortunately, link availability is not continuous as indicated in Eqs. (6) and (7). 
All the times indicated in Fig. 8 must be multiplied by 9 to get the actual duration 
required to get the same data down. A corresponding graph is shown in Fig. 9. 

Clearly, the problems are severe and shouldn't be taken lightly, particularly if 
large format sensors are really anticipated. With the basic implementation (Rq = 1, CF 
= 1) and an image size of 1024 x 1024, the total time necessary is over 2.5 years, 
making transmission quite impractical. Even assuming 256 x 256 size sensors would 
only reduce this down to about 2 months. 

The most feasible implementation in a near-term time frame is the (Rp = 3, CF = 
2) combination. This gets the maximum time down to about 5 months. 

If the "average" 2 n x 2 n frame size were 2 n = 512, the (Rd = 3, CF = 2) system 
could get all the data back in a little over 1 month, perhaps an acceptable situation. 
This goes to about 1-1/2 weeks if the frame size is instead, the baseline 256 x 256. 
Using only 8 bits/pixel, reduces this further to about 1 week. 
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TOTAL DOWNLINK TIME (DAYS) 



Fig. 9. Total Downlink Time (including 1/9th duty factor) 
required for transmission of full 10-minute Sensor Events using 
4 sensors and 1024 x 1024 x 12 images or central, 2 n x 2 n sub-images 







BACKGROUND/FOREGROUND MODES 


Thus far, all compression considerations have employed only lossless coding, 
focusing primarily on getting all the data back, unmodified, thus satisfying all the 
desired goals noted in Paragraphs (la) through (1e) and (2a) through (2e). This 
section focuses on situations where either all the data is not required and/or the 
reconstructed data is not required to have perfect fidelity. Under the following 
conditions, significantly higher compression factors should be possible. 

• Verification; 

• Quick survey (browsing) to discover where a problem developed (for 
subsequent transmission of such data with lossless fidelity); 

• Verification and full-frame visual playback. 


Definitions 

The concept of Background/Foreground Modes (BGFGs) is illustrated in Figs. 
10 and 1 1 . To parameterize the discussion: 

• A full frame size is 2 N x 2 N (baseline N = 8, maximum N = 10). (14) 

• The central (foreground) 2 n x 2 n region of every £th frame in a (15) 

Sensor Event is transmitted and reconstructed without error 
(lossless). 

• Full frames are communicated only every ath frame, where a > £. (16) 

These are Background Frames. Reconstruction fidelity of 
Background Frames is not necessarily lossless. Data not 
communicated is reconstructed by interpolation from transmitted 

data. Note that KHILS is equipped for a variation of this in which 
intervening missing frames are reconstructed by repeating the 
most recent fully reconstructed frame. 
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EVERY ;th FRAME SENT; 
LOSSLESS 


EVERY ath FRAME SENT (COMPRESSED); 
INTERPOLATE MISSING FRAMES 


Fig. 10. BGFG Mode Concept. 
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Fig. 11. BGFG Block Diagram 


Figure 11 can be used to develop an overall expression for a System 
Compression Factor. Here, Rin denotes the data rate associated with 
uncompressed data (all the frames). 


Foreground compression of the central 2 n x 2 n sub-image area is a two step 
process: 

• A i^:1 reduction because only every £th frame is sent, and 

• A y.1 reduction by compressing these central frames. 
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We will carry these definitions into the specification of an overall System Compression 
Factor, but then restrict £ to C, = 1 in evaluating the consequences of varying 
parameters. That is, we will later assume that all central frames are sent (full frame 
rate). 


Note also that we restrict the compression of central frames to be lossless. More 
generally, the compression of the central region could be achieved by any approach, 
such as those considered for background frames. The System Compression Factor 
equation below supports that generality, although it would not appear to have any 
significant value for the MSTI driven goals. 

Foreground 2 n x 2 n lossless compression by a factor of y:1 results in a rate of 

Rjn 

22(N-n) . y . £ ( 17 ) 


If we choose n = 6, N = 8 as realistic baseline parameters, this term becomes 


Rin 

16 ■ Y ■ C 


(18) 


where the central area transmitted is 64 x 64. 


Background Frame reduction in rate is a two step process: 

• An a:1 reduction because only every ath frame is sent, and 

• A (3:1 reduction by compressing each of these frames. 

The net result is a rate of 


Rin 

a(3 


(19) 


which can be reduced slightly by not "redundantly" sending the central area in each of 
these frames (which is already sent with perfect quality). 
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Using these terms we can calculate an overall System Compression Factor as 

sCF = — -J— (20a) 

^ (1 " ^ ( 22(N^)) + ^ 

where the (1 - Y^/ap) term takes into account this central redundant area, assuming 
that a divides 

Henceforth, we will assume that £ = 1 ( no frame rate reduction on 
the central area). Then Eq. 20a reduces to 

sCF = (20b) 

Y (1 * ( 22(N-n)) + ^ 

A Limiting Case. Note that if we take a = 1 (no frame rate reduction on 
background) but make the compression factor p infinite, sCF in Eq. (20b) reduces to 

sCF = y ■ 22(N-n) (21) 

This is the overall reduction obtained by discarding all data outside the central 2 n x 2 n 
area and compressing the center by the factor y. 

Simplification. When ap » y, Eq. 20b can be simplified to 

p2(N-n) 

sCF = 1 2 2 (N-n) ( 22 ) 

Y ap 

Again, if ap » 2 2 (N-n) j S CF converges to Eq. (21). 
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Analysis Assumptions 


The following section will use graphs and tables to describe the potential 
benefits of BGFG modes. We make the following assumptions. 

Lossless Compression 

• y = 2 for lossless intraframe compression of any foreground area. (23) 

• p = 2 for lossless intraframe compression of any background frame. (24) 

• y = 6 for lossless F-to-F compression of data in which adjacent (25) 

frames (in time) are separated only by 1 /30th second. Thus, we 

are assuming that F-to-F compression is not applicable to the 
compression of Background Frames unless a -> 1 . (Actually, F-to-F 
compression does not instantly become useless but the achievable 
compression factor will decrease as the frames become more 
separated in time. We make this y = 6 restriction here because 
of uncertainty and because we are focusing on results where a » 1.) 

It should be understood that the possibility of achieving the y = 6 F-to-F 
compression is highly sensor dependent. The actual values achieved on applicable 
sensors could be more or less than the y = 6 assumption used in these analyses. 

A word about unnecessary overhead. JPL recently implemented a 
separate demonstration of lossless compression for the MSTI 3 environment. In this 
demonstration, the format specification requires the use of redundant 48-bit identifiers 
on each line, as well as additional lines of engineering data at the end of each frame. 
Clearly, these types of constraints increase the overhead and decrease the net 
compression efficiency that can be obtained. The penalty in loss of efficiency is quite 
dramatic when frame sizes are small, as BGFG modes where the central area is 
transmitted could be as small as 64 x 64. The following analysis illustrates the 
magnitude of these penalties. 
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Let the image size be 2 n x 2 n with a quantization of q bits/sample and let y be 
the compression factor achieved on the actual image as we have been assuming. 
Then the number of bits/pixel required on the actual data is given by 


but the overhead adds a penalty of 


12 + q 
2n-2 


bits/pixel 


(27) 


for a total of 


y + ~ 2rii2 ~ bits/pixel (28) 

Equation 27 is plotted in Fig. 12 for q = 12 and q = 8. 

• With a quantization of q = 12, the penalty when compressing (29) 

a 64 x 64 size image (n = 6) is as much as 1 .5 bits/pixel. Instead 

of representing an image with 6 bits/pixel (with y- 2), the 
representation would require 7.5 bits/pixel, a 25% increase. 

• If quantization were q = 8, a 2:1 compression would require (30) 

4 bits/pixel for the real data, but 5.25 bits/pixel for image plus 
overhead. In this case the overhead incurs a penalty of 31%. 

• Now suppose that q = 12 but F-to-F representation yielded (31 ) 

compression of y = 6 (other constraints would need to be 

changed too). Then, y = 6 means that only 2 bits/pixel are 
required for the image data but 3.5 bits/pixel are required for 
image plus overhead. The overhead now incurs a 75% penalty. 

Paying such a penalty is absurd. Subsequent discussions will assume 
that these issues have been resolved so that an overhead penalty is 
negligible for all n. 
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Fig. 12. Overhead Penalty 

Background Frames 

Setting parameters for Background Frames is more difficult since we are now 
allowing "lossy compression." Quality of reconstruction is based on a visual 
observation of reconstructed frames (including reconstructed missing frames), usually 
displayed at a full frame rate (30 f/s). Specifying quality is particularly difficult because, 
by definition, the primary central region is always perfect. Further, since the net 
reduction factor is ot(3, the same reduction factor can be achieved by various 
combinations of a and p with varying results. 
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But we can make certain practical assumptions on techniques and technology 
that will allow us to "see" the approximate consequences of pursuing further 
developments that might offer benefits in later flights. 

• Lossless - P = 2 (see paragraphs 24 and 25). (32) 

• JPEG - We will assume that operation of the JPEG algorithms (33) 

can be acceptably achieved with a compression factor of p = 1 0. 

• FRACTALS - We will assume p = 100 when using FRACTAL- (34) 

based algorithms. 

Certainly, these assumptions are subject to revision, but they provide a likely 
range of acceptable lossy compression factors. The FRACTAL and JPEG approaches 
are identified as being likely candidate compression algorithms that might achieve the 
two ends of this range. The p = 10 factor might be too low for JPEG and the p = 100 
factor might be too high for FRACTALS. 

Note that in this study we are excluding the possibility of extremely simple lossy 
approaches which achieve a high p by combining sub-sampling with lossless 
compression. While clearly inferior to JPEG and FRACTALS, such simplicity might still 
accomplish objectives more easily in a hardware-constrained implementation. 

BGFG Analysis 

It is difficult to consider all the possibilities one might consider. In the following 
discussion we will assume that with foreground frames of size 2 n x 2 n and background 
frames of size 2 N x 2 N (see Fig. 10) 


N = n + 2 


(35) 


so that 


22(N-n) = 16 


(36) 
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in Eqs. 20 and 22. Then sCF in Eq. (22) reduces to 


sCF - .| ig (37) 

Y + a(3 


for the BGFG combinations in Table 3. 


Table 3. BGFG Image Sizes Studied 


BACKGROUND 
IMAGE SIZE 

FOREGROUND 
IMAGE SIZE 

1024X1024 
512X512 
256 X 256 

256 X 256 
128 X 128 
64X64 


Figure 13 plots the System Compression Factor sCF from Eqs. (20) and (37) as 
a function of the Background Frame Rate Reduction Factor, a, for various assumptions 
on algorithm usage and using Table 3. 

Y = 2, p = 2. Here baseline intraframe lossless coding is applied to both 
Foreground and Background frames. This is really the baseline concept. With a = 1 
(no frame rate reduction), sCF = 2. As a increases, the impact on sCF decreases, 
because the fraction of total rate generated by background frames decreases. sCF = 
16 at a = 16 (=2 f/s) but doubling a to 32 only increases sCF by about 30%. 

Y = 6, p = 2. In this case we apply F-to-F lossless compression to the 
foreground frames with the assumption that y= 6. With a strict adherence to Eq. (14), 
sCF — > p = 2 as o — > 1 . But this is not realistic since as a -» 1 the frame rate 
approaches 30 f/s and background frames really become foreground frames, allowing 
the use of F-to-F techniques on all data. This is indicated by the solid arrow. 

At a = 8 (=4 f/s) the superior 6:1 compression on foreground frames improves 
sCF by only a small amount. This is because at a = 8, the contribution from 
background frames (at p = 2) dominates. 


30 




SYSTEM COMPRESSION FACTOR 



BACKGROUND FRAME RATE REDUCTION, a 

Fig. 13. BGFG System Compression Factors using 
Table 3 Image Sizes 

Even at a = 16 (=2 f/s) the overall sCF is increased by only 50% over the use of 
intraframe compression alone. 
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Y = 2, p = 10 (JPEG). Now apply JPEG at p = 10:1 on the background frames 
and maintain a y = 2:1 ratio on the foreground frames. Surprisingly, at a = 16 the net 
sCF is almost identical to the one obtained by using the y= 6 F-to-F compression (with 
P = 2). When a -» 32, this mode is significantly worse than with y = 6, p = 2. 

y = 2, p = 100 (FRACTAL). Again the results are surprising. Except at low 
frame rate reductions (4 < a < 8), the advantage of even 100:1 on background frames 
doesn't materialize in large improvements in sCF. At a = 16, this mode is only slightly 
better than y = 6, p = 2 and perhaps twice as good as the baseline (y = 2, p = 2). 

y = 6, p = 10 (JPEG). By introducing F-to-F compression on the foreground 
frames, a dramatic improvement in sCF is obtained at the lower values of a. Or 
interpreted another way, F-to-F improvements to the foreground lossless compression 
factor net an equivalent improvement in sCF, when using JPEG on background 
frames. 

y = 6, p = 100 (FRACTAL). As with JPEG, real improvements occur. Nearly 
100:1 is obtained at a very low a = 4. 

Key Points on sCF. Some key observation points are provided in Tables 4 

and 5. 


If we take into account the results in Fig. 13 and Tables 4 and 5, and the 
downlink assumptions in Equations (4)-(6), we can compute the potential value of 
these modes in speeding up downlink transmission times. Tables 6 and 7 provide 
selected results which specify the Actual Sensor Event downlink times (i.e., including 
the 1/9th duty factor) assuming image sizes of 2 N x 2 N where N = 8 or N = 10 and Rd = 
1 Mbit/s or Rd = 3 Mbits/s. 

The results for 1:1, 2:1 and 6:1 (Table 6), taken directly from Fig. 9, correspond 
to full transmission of all Sensor Data in a lossless fashion (a = 1). 

The remaining results, in Table 7, assume a frame rate reduction factor of a = 8 
(=4 f/s). 
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ALL DATA SENT 


Table 4. Important BGFG Mode Observations, I. 


a 


PARAMETERS 


WHEN 


y = 2, N.A. 


VERIFICATION 


y = 2, p = 2 (BASELINE) 
y = 6, P = 6 (F-to-F) 
y = 2, p = 10 (JPEG) 
y=2, p = 100 (FRACTAL) 


Y= 6, p = 100 (F-to-F, FRACTAL) 


DATA SET GENERATION 
DATA SET GENERATION 
VISUAL 
VISUAL 
VISUAL 


SYSTEM CF 
32 
2 
6 
6 

24 

50 


Table 5. Important BGFG Mode Observations, II. 


a 

PARAMETERS 

WHEN 

SYSTEM CF 


y = 2, P = 2 (BASELINE) 

BASELINE, VISUAL 

11 


y = 2, P = 10 (JPEG) 

LOSSY VISUAL 

22 

8 

y=2, p= 100 (FRACTAL) 

LOSSY VISUAL 

31 

Y = 6, p = 2 (F-to-F) 

LOSSLESS VISUAL 

14 


y= 6, P = 10 (F-to-F, JPEG) 

LOSSY VISUAL 

44 


y s 6, p = 100 (F-to-F, FRACTAL) 

LOSSY VISUAL 

93 


33 





















Table 6. BGFG Actual Downlink Times 
Default a = 1 (all data lossless) 



BGFG ACTUAL DOWNLINK TIME (DAYS) 

IMAGE 

niMCMCIAkl 

sCF 

= 1:1 

sCF 

= 2:1 

sCF 

= 6:1 

DIMENSION 

2 n 

r d =i 

R d =3 

i'- 

ll 

o 

DC 

R d =3 

r d =i 

R d =3 

256 

58 

19 

29 

9.7 

9.7 

3.2 

1024 

933 

311 

466 

155 

155 

51 


NO 

COMPRESSION, 

FULL 

RECONSTRUCTION 

BASELINE, 

FULL 

RECONSTRUCTION 

F-TC 

FUI 

RECONST 

>-F, 

LL 

RUCTION 


Table 7. Selected BGFGs, Actual Downlink Times, a = 8. 
Image Size 2 N x 2 N , Central Area Size 2 N " 2 x 2 N ~2 




BGFG ACTUAL DOWNLINK TIME (DAYS) 

IMAGE 

DIMENSION 

2 n 

sCF 

: = 11 

sCF = 16 

sCF = 30 

sCF = 44 

sCF = 93 

R d =1 

R d =3 

r d =1 

r d = 3 

II 

o 

DC 

CO 

II 

O 

DC 

Rp=1 

R d =3 

II 

o 

DC 

R d =3 

256 

21 

7 

14 

4.8 

7.8 

2.6 

5.3 

1.8 

2.5 

0.83 

1024 

84 

28 

58 

19.4 

31 

10.4 

21.2 

7.1 

10 

3.3 


CO 

_ <0 
CM UJ 

11 5> 

CO. 

CM* O 

t-d 

< 

o 

H 5T 

CQ_ UJ 

II W 

S s 

II H 

ea. O 

CM* $ 

II U. 
W 

o 

ii 

CO. 

<o 

ii 

(JPEG) 

o 

o 

II 

CO. 

CD 

II 

(FRACTAL) 
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Major Observations 


We can now make some major observations. 

If Lossy Compression Is Not Employed: 

a) Improved F-to-F lossless compression would be valuable for sending all 
data back (a = 1). 

b) There is little advantage in using F-to-F lossless compression rather than 
BGFG Modes unless a > 16. 

If Lossy Compression Is Used on Background Frames: 

The net gains are significantly constrained by a 2:1 foreground compression 

factor for a > 8. Improvements by using F-to-F map directly into improvements in 

overall sCF. 

The most significant system benefit from future improvements to 
compression for MSTI (i.e., beyond the baseline 2:1 lossless intraframe compression) 
would come from F-to-F lossless. It is important to determine the feasibility of achieving 
the "estimated" 6:1 F-to-F CF before embarking on costly developments involving lossy 
compression factors. Improvements via F-to-F lossless would impact all aspects of the 
data system: mass memory, downlink for all the data and downlink in conjunction with 
BGFG modes. 

INDEXING THE DATA 

Suppose the most sophisticated BGFG mode just discussed (y = 6, p = 100, a = 
8) were used to rapidly communicate a Sensor Event down (sCF = 93:1). If all goes 
well, the target remains in the center where the data is perfect and the quality of 
surrounding (interpolated) data is high enough to provide a quite adequate full frame 
display. If something goes wrong, the display on the ground should be good enough to 
allow detection of the problem. 
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But to determine why a problem developed, all the image data (full 
frames, perfect fidelity, full frame rate) should be available for a time slice around the 
point where a problem was detected. This is the Critical Data (see paragraph 2b). 

It makes sense to secure this Critical Data before transmitting all the data to 
avoid the consequence of a potential system/spacecraft failure impacting the data. To 
do this means that the mass memory storage of Sensor Events cannot be treated as 
simply storing one long string of data. Instead, there needs to be a means to both 
command and access specific segments of the stored data (as determined by 
observing reconstructed BGFG data on the ground). 

As an example, suppose the size of the Critical data segment was 30 seconds 
(x 1000 frames). Compared to the communication of all data from a full 10 minute 
Sensor Event, this provides a 20:1 reduction in the time required to secure the 
Critical Data (a worst case situation). If the problem only developed on one of four 
sensors, the effective reduction is 80:1 (although other sensor data might provide 
supporting evidence of the problem). 
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IV. SUMMARY 


Much of the information provided here is conceptual and cannot be adequately 
summarized. Graphs, tables and concepts need to be studied and understood. 
However, we can make numerous observations that may be helpful to keep in mind. 

1 An initial program emphasis on lossless intraframe compression makes sense. 
This type of compression appears feasible in a short time frame and can 
significantly improve all aspects of MSTI storage and communication. 
Therefore, it would be desirable to initially demonstrate as many aspects of its 
usage as possible: 

• Compression into memory 

• Compression out of memory 

• Compression within BGFG Modes 

2. Investigations of F-to-F lossless compression might yield valuable performance 
improvements for later flights - affecting all aspects of MSTI storage and 
communication. 

3. BGFG Modes - a wide range of investigation is possible. Such modes will be 
most valuable if F-to-F lossless compression can be effectively used. 

4. Ground processing/decompression/display/command considerations need to 
be integrated into the overall end-to-end problem thinking. 

• Interactively responding to received data from a Sensor Event would 
require: 

a) a fast decompressor 

b) some form of on-board image indexing 

5. Using FRACTALS or JPEG requires uncompressed data from mass memory. If 
lossless compression were used to improve mass memory, an on-board 
decompressor would be needed. 
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* Data compressed into mass memory with lossless coding might be used 
directly in BGFG Modes without a decoder, but would need additional 
formatting. 

Appendix B provides an apparently quite feasible, relatively low-cost 
improvement to the MSTI data link - this would include a possible 3:1 
improvement in data rate and improved error characteristics for error-sensitive 
data (e.g., compressed). This implementation would additionally provide the 
Ballistic Missile Defense Organization with another important technology 
demonstration with possible broad impact on other programs. 



APPENDIX A 


Algorithms for High-Speed Universal Noiseless Coding 




I. INTRODUCTION 


References 1-4 provide the development and analysis of some practical 
adaptive techniques for efficient noiseless (lossless) coding of a broad class of data 
sources. These have been applied, in various forms, to numerous applications. 

Those functions and algorithms most desirable for incorporation in a "coding 
module" which could be implemented using current custom VLSI capabilities were 
presented at the first NASA Data Compression Workshop at Snowbird, Utah, in 1988. 5 
A workshop committee recommended that NASA should proceed and implement this 
"coding module." Since then, both the Jet Propulsion Laboratory (JPL) and the 
Microelectronics Research Center (MRC) at the University of New Mexico have 
implemented first-generation single-chip CMOS VLSI coding modules. 0 ' 0 The MRC 
1.0 pm coding chip was successfully tested under laboratory conditions at up to 900 
Mbits/s. 7 ' 0 A companion MRC "decoding module" is designed to run at up to half the 
maximum rate of the coding module. These first-generation MRC chips are now 
available commercially from Advanced Hardware Architectures in Moscow, Idaho. 10 

Both the MRC and JPL developments are nearing completion of second- 
generation space-qualified versions for the coding modules. 1 1-1 ^ It is anticipated that 
the high performance functionality of these modules, first- and second-generation, can 
serve most of NASA's science data needs where a lossless representation is 
appropriate. 

The intent of this paper is to provide a concise description of the basic 
algorithmic and performance characteristics which are embodied in the coding 
modules. A more general algorithmic development can be found in Ref. 15 along with 
some application notes. Observe that the actual implementations have diverged 
slightly from the definitions provided here and from each other. Most of these subtleties 
will be discussed. 


A-1 



II. THE CODING MODULE 


A functional block diagram of a general-purpose lossless "coding module" is 
shown in Fig. 1. A variation in Rice’s original notation (of subscripting the Greek letter 
\j/) will be used to name various coding operations. Subsequent sections will quickly 
converge to more specific definitions that relate to the VLSI modules being 
implemented. 

The input to this coding module 

X n = x-) x 2 . . . xj (1) 

is a J sample block of n bit samples. Y is a priori or Side Information that might help in 
the coding process. 

The overall unspecific process of representing X n is named PSI?+ so that the 
actual coded result is 


PSI? + [X n ,Y] 

As Fig. 1 shows, the coding process is split into two independent steps 
discussed below. 

Step 1 

A Reversible Pre-processor is a process designed to convert the source 
represented by X n sequences (and Y) into a close approximation to a STANDARD 
FORM Data Source, represented by 5 n sequences. This process usually includes a 
de-correlation procedure (prediction). 

The pre-processor converts each X n (and corresponding Y, if any) into 

5n' = S 1 5 2 ...8j (2) 
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A PRIORI 
OR 
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Y 


PSI?+ 


I 

INPUT I 
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DATA I 
BLOCK | 

+ 

I 


REVERSIBLE 

PRE-PROCESSOR 


x n = x 1 x 2 ....x 



PS1? 

ADAPTIVE 

ENTROPY 

CODER 


6 n - 6^2 ...6j 

STANDARD 
FORM SOURCE 


CODED 

BLOCK 

PSI?[6 n ] 


EXTERNAL 

PRE-PROCESSED 

DATA 


Figure 1. General Purpose Noiseless Coding Module Block Diagram 

a J > 1 sample sequence of n' bit samples. Usually n = n\ and we will henceforth 
assume that here. 


Standard Form Source. Specifically, 

a) Samples of 5 n are the non-negative integers 0, 1 , 2, ..., q (3) 

b) Samples of 5 n are independent (4) 

c) With pj = Pr[8j = i], the probabilities are ordered so that the smaller 
integers occur more frequently, i.e., 

PO > Pi > P2 > • • • ( 5 ) 


d) The source entropy is given as 


H s = -£P' log2 pj bits/sample 
i 


( 6 ) 


The best pre-processor will meet these conditions and produce the lowest H§. 
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Step 2 


An adaptive entropy coder, named PSI? for now, efficiently represents 
(Standard Source) pre-processed 5 n sequences with the coded result 

PSI? [5 n ] = PSI? + [X n , Y] 

This entropy coder is independent of the pre-processor. The pre-processor's 
goal is to achieve performance that remains close to H5 as it varies with time. 

Note that the coding module in Fig. 1 allows for coder PSI? to be used directly 
on externally supplied pre-processed data. 

Prelude to the Details 

A general form of an Adaptive Entropy coder (designed to efficiently represent 
Standard Data Sources), which chooses from multiple algorithm options on a block- 
by-block basis, will be identified in the next section. Specific sets of such code options 
will be defined and incorporated in this structure as a parametrically defined 
adaptive coder. In doing so, the unspecific "PSI?" will be turned into a specific coder 
called "PSIss." 

Finally, the specific parameters of PSIss that are used in current VLSI 
implementations will be identified. 
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III. ADAPTIVE ENTROPY CODER FOR THE STANDARD SOURCE 

PSI? of Fig. 1 represents the general-purpose adaptive coder called PSI11 in 
Refs. 2-4 and 15. Basically, such a coder chooses one of a set of Code Options 
(coding algorithms) to use to represent an incoming block of "pre-processed data 
samples. A unique binary identifier precedes a coded block to tell a decoder which 
decoding algorithm to use. The following discussion will identify specific code options. 

Code Options 


Backup. When no coding of any form is performed on the data, we call this 

PSIbu 


PSIbu [5 n ] = 5 n ( 7 ) 

This representation is used in an adaptive coder when all other available code options 
fail to compress 5 n . References 7-9 call this the "default" option. 

The Fundamental Sequence Code. Recall that the pre-processed 
samples of 8 n are the non-negative integers i > 0. A variable length "Fundamental 
Sequence Code, fs", is defined for each i as follows 

i zeroes 

fs[i] = 000 0 0 0 1 for i > 0 (8) 

That is, simply append a 1 to the end of a sequence of i zeroes. 

The "Fundamental Sequence" itself is the application of fs[.] to all the samples of 
5 n . Following Rice's notation, 1 


PS II [5 n ] = fs^] * fs[S 2 ] * . . . fs[Sj] 


( 9 ) 


1 An asterisk, *, is used to emphasize the concatenation of sequences. 
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is the Fundamental Sequence. This defines Code Option, PSI1. 

Split-Sample Modes. The code option definitions here are basically to "split" 
off the k least significant bits of each 5 n sample and send them separately. The 
remaining n-k most significant bit samples are then coded using PSI1. Specifically, 
with 

5 n = 5-j 5 2 - - - 5j 


Let 


M n >k = 


m-| m2 . . . mj 


( 10 ) 


be the sequence of all the n-k most significant bit samples of 5 n and let 


L k = sb-j * sb 2 * . . . . sbj 


( 11 ) 


denote the corresponding sequence of all the k-bit least significant bit samples of 8 n . 


That is 


5j = mj * sbj 


( 12 ) 


The "Split-Sample" Mode Code Option PSIl.k is defined by 


PSI1,k[8 n ] = PSI1[M n «k] * L k 


(13) 


Note that k = 0 is a special case where 


PSI1,0 = PSI1 


(14) 


and when k = n 


PSI1 ,n = PSIbu 


(15) 
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The Individual Code Words. Note that the individual code word 
assigned to 5j in (12) when code option PSIl.k is applied is given as 

fs[mj] * sbj (16) 

That is, the Fundamental Sequence Code, fs[ ] in (9), is applied to the most significant 
n-k bits of Si, followed by the least significant k bits of 8j. From this description, it should 
be easier to see that the only variable-length-code operation ever required is 
the application of fs[-], since the " sbs" can simply be shifted out, and fs[ ] can be 
implemented without any table lookups. 

Performance of the Individual PSI1,k Options. Under certain familiar 
assumptions about the type of data source (these assumptions will be described in a 
later section), 

the individual "variable length codes" represented by (16) 

were shown by Yeh 17 - 18 to be equivalent to Huffman Codes. (17) 

Thus they are not only extremely simple, they are optimum too. 

But even more important for their application in an adaptive coder, the entropy 
where PSI1 ,k achieves its best performance is at 


Hg = k + 2 bits/sample 


(18) 


and performance remains close to Hg over a range of about ±0.5 bit/sample. Thus 
there is at least one PSI1 ,k option that should provide efficient coding for any 


Hs > 1 .5 bits/sample (19) 

Such conclusions can also be drawn directly from simulations using a broad 
range of data sources. 
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Split-Sample Adaptive Coder, PSIss 


We can now use the Split-Sample Modes described above to replace the 
general coder PSI? in Fig. 1 with a specific class of parametrically defined adaptive 
coders, named PSIss. PSIss is based on the following parameters: 

J = block size > 1 

n = Input Bits/Sample (20) 

N = number of Code Options 
X > 1 (Dynamic Range Parameter) 

A functional block diagram is shown in Fig. 2. 


PSIss 



Figure 2. Parametric Split-Sample Adaptive Coder Functional 

Block Diagram 

The representation of J sample 5 n , using an N option PSIss with parameter X> 
1 , is given by 

PSIss[6 n ] = ID(id) * PSI1 , k(id)[5 n ] (21 ) 
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where 


id = 0, 1, 2, . . . N-1 (22) 

is the integer value of a coder identifier for the options used, and ID(id) is its standard 
binary representation, requiring 2 

[log2 N] bits (23) 


PSI1 ,k(id) is the Split-Sample option specified by id, where 


k(id) = 


n for id = N-l 
X - 1 + id Otherwise 


(24) 


for parameter X > 1. By (15) and (24) the last option is PSIbu in (7). 

Dynamic Range. Except for limiting cases, the range of entropies where 
PSIss can be expected to efficiently represent pre-processed 5 n sequences has been 
shown to be closely specified by 

( P 

A. + 0.5 < H5 < min l ( 25 ) 

( X + N - 0.5 

A close look at this expression shows that each increase in X moves the 
Dynamic Range of efficient performance upwards by 1 bit/sample. 

Performance Graph. A graph of typical performance for PSIss with N = 12, X 
= 1, n = 14 and J = 16 is shown in Fig. 3. 


2 [z] is the smallest integer, greater than or equal to z. 
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ENTROPY, H 6 , bits/sample 


Figure 3. PSIss Average Performance for N = 12, n = 14, X - 1, J = 16 

Choosing the Right Option. The optimum criterion for selecting the best 
option to use to represent 8 n is to simply choose the one that produces the shortest 
coded sequence. That is, 

choose k = k* if 3 


(PSIl.k* [8 n ]) = ^ {^(PSI1,k[5 n ])} 


(26) 


Now, letting 


Fk = ^(PSII [M n -k]) 


we have from (12) 


3 ^(Z) = length of sequence Z in bits. 
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S’fPSII.k [8 n ]) = Fk + Jk 


(27) 


and from (8) - (10) 


J 

Fk = X mi + J ( 28 ) 

i=1 

(i.e., the sum of the most significant n-k bit samples plus the block size). 

Thus (27) and (28) can simplify the decision making in (26) without actually 
coding the data. But this can be further simplified. 

By taking advantage of the randomness in the least significant bits, the 
expected value of Fk can be related to Fo by^.15 

E{Fk|FQ} = 2-kFo + |(1 -2-k) (29) 


which we use as an estimate in (27). We have 

Yl,k (6 n ) = 2-k F 0 + ^ (1 - 2‘k) + Jk 
=<S?(PSI1 ,k [5 n ]) 


(30) 


We can then choose k = k* if 


Yi,k*(8") = k { ^1 ,k (5 n ) } (31) 

and this leads to distinct decision regions based solely on Fo (which by (28) can be 
determined by adding up the original samples since mj = 5j). The boundaries to 
adjacent PSIl.k decision regions are given by 

Fq = | + J (2 k+1 ) bits (32) 
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Any PSI1,k option will generate more bits than PSIbu when 


Fq > ^ [ (n-k) 2 k+1 + 1 - 2 k ] bits 


( 33 ) 


A sample table of decision regions is shown in Table 1 for an N = 8 option 

PSIss. 


Note that if the largest value of k used is k = i , then PSI1 , l will generate more 
bits than PSIbu when 


Fj > (n -l) J bits (34) 

which may be a simpler test than (33) provides, in some cases. 

Table 1. Decision Regions for an N = 8 Option PSIss 


CODE 

OPTION 

F 0 REGION IN BITS 

PSI1,0 

F 0 * 5J/2 

PSI1.1 

5J/2 < F 0 s 9J/2 

PSI1.2 

9J/2 < F 0 s 17J/2 

PSI1.3 

17J/2 < F 0 s 33J/2 

PSI1,4 

33J/2 < F 0 s 65J/2 

PSI1.5 

65J/2 < F 0 s 129J/2 

PSI1,6 

129J/2 < F 0 s (128n-831)J/2 

PSIbu 

(128n-831)J/2 < F 0 


PSIss Implementation Parameters 

The primary PSIss parameters used in the first-generation VLSI 
implementations were as follows;6-9 
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a) J = block size of 8 n = 1 6 

b) n = supported quantization of 8 n samples 

= 12 for JPL chips 
= 4, 5, .... 14 for MRC chips 

c) N = number of code options 

= 11 for JPL chip 
= 1 2 for MRC chip 

d) X = Dynamic Range Parameter = 1 = Starting Option Parameter 

Some Subtle Differences. By Eq. 23, the number of fixed identifier bits does 
not need to be more than [log 2 N] bits. A coder which represents 8-bit data has no 
need for more than 8 code options and thus should need no more than a 3-bit 
identifier. The second-generation MRC design will recognize this situation. However, 
JPL’s first- and second-generation chips and MRC's first generation chip fixed the 
number of identifier bits at four. 

In the JPL case, the first-generation chip only supported input quantization of 12 
bits/sample, so this was not an issue. A second design now provides support for data 
with quantization down to 8 bits/sample. But in doing so the number of code options is 
still fixed at 1 1 , yielding 4 bits for identifiers. 


The JPL support for n < 12 is provided by treating data of lesser quantization as 
right-justified 12-bit data. The consequence of this is that there does not exist a true 
backup (default) mode when input data is not truly 12 bits/sample. For example, 
instead of using an 8 bits/sample PSIbu (e.g., as dictated by Table 1), a JPL second 
generation coder would instead use some intermediate split-sample mode beyond 
PSI1,6. In the rare eventuality that PSIbu is needed, this shortcoming would incur a 
penalty of over 1 bit/sample in the block in which it occurs. Each MRC design uses a 

true backup. 

By (22) and (24), PSIbu in (7) is always assigned the identifier id = N-1. Thus 
an N=1 1 option coder would assign the binary four-tuple identifier "1010" (for ten). 
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JPL's N=1 1 designs instead assign the "all-ones four-tuple” to the PSIbu identifier. 
Similarly, and more generally, the MRC designs assign the "all-ones four tuple" to the 
PSIbu identifier for N>8 and the "all-ones three-tuple" when N<8. 


Another distinction between the JPL and MRC designs lies in the method for 
determining which code option to use. The JPL coders basically use the approach 
illustrated in Table 1 where decisions are based on F Q alone. (This is generally the 
most desirable technique for software implementations because of the minimal 
computation requirements.) MRC coders instead make decisions based on the exact 
bit count for each option (requiring the calculation of each F k in (28)). The difference in 

average performance between the two methods has been shown to be of no practical 
significance. 

By (21), (13) and (14), the form of a coded block is 

ID(id) * PSI1 (M n -k) * Lk 

Here, the JPL and MRC approaches diverge slightly. The MRC format follows the 
definition of L k given by (11) and (12). However, the JPL format splits L k further into k 
subsequences, each containing all the sbs from each sample which are of the same 
significance. While this provided a simplicity in the JPL coder design, it incurs a 
penalty on the operations required by a decoder. 

Some additional differences are noted in a later section on pre-processing. For 
specific details on these implementations, consult Refs. 6-16. 

More Generality 

The adaptive coder we have designated as PSIss is actually a subset of the 
more general coders in Ref. 15 (PSI14 and PSI14.K). The latter coder definitions 

1) Permit X to be zero or negative, thus allowing for additional "low-entropy" 
code options that provide improved performance below 1.5 bits/sample. Ref. 15 and 
earlier papers provide various low-entropy code options which can be incorporated 
into this structure or can stand alone as separate approaches. Yeh has provided a 
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computationally simple algorithm for incorporation in the second-generation MRC 
design. 1 4 

2) Provide a second parameter, K, for adjusting the Dynamic Range. K 
basically specifies a fixed K-bit pre-split of data samples, before or after a pre- 
processor. 

3) An extended coding structure that allows these same algorithms to also 
be applied to the representation of code identifiers. This reduces the overhead penalty 
when operating many-optioned coders on fairly stationary data. 
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IV. THE PRE-PROCESSOR 


The entropy coder, PSIss, as described in the previous section, was designed 
to efficiently represent (pre-processed) Standard Data Sources. PSIss doesn't need to 
know which pre-processor was used to produce its input. However, for an extremely 
broad set of real problems, the general pre-processing function of Fig. 1 can be 
replaced by the more specific Basic Predictive Pre-processor in Fig. 4. It is shown 
imbedded within the complete coding module for your convenience. 

Standard Predictor 

The first part of this pre-processor is a very simple predictor consisting of a 
single sample-delay element. With xj as the i^ sample in X n , this delay element 
"predicts" that xj equals the previous sample: 


xj = xj_i (35) 


CODING MODULE, PSIss+ 



PREDICTION 

Fig. 4. Basic Predictive Pre-processor Within Coding Module, PSIss-t- 

The previous sample could be the last sample from a previous block when coding 
multiple blocks. It is assumed that the sample delay is always initialized with a 
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prediction. But note also at this time that the module's design allows for an External 
Prediction to be supplied in place of this simple one-dimensional form. 

The difference between any sample and its prediction produces the error signal 


and the block of J error values 

A = Ai A2 • • • Aj ( 37 ) 

As a new data source, A sequences tend to be uncorrelated and display a unimodal 
distribution around zero (when data values are not near the boundaries of its dynamic 
range). That is 4 

Pr [Aj = 0] a Pr [Aj = -1] s Pr [Aj = +1] * Pr [Aj = -2] * . . . . (38) 


The Mapper (into Standard Source) 

When the latter condition in (36) is true, the following function will map each Aj 
into a corresponding Standard Source 6j such that 

P0 * pi * P2 * P3 * 


Additionally, it will assure that an n-bit/sample xj from X n produces an n-bit/sample 6j. 
Further, the desired probability ordering of the 6j is more closely approximated when xj 
values are near 0 or xmax = 2 n — 1 . 


4 An equally valid assumption is 


Pr [Aj = 0] 2 Pr [Aj = +1] 2 Pr [Aj = -1] a Pr [Aj = +2] a . . . 


which is the basis of the JPL VLSI chips. It leads to a mapping function which looks very similar to (39) and 
(40). Reference 15 shows that a “decoder* using one mapping function could be used to decode data 
that had been generated using the other mapping function. 
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The Mapper, used in the MRC design, is defined by 


5i = 


2Aj 0 < Aj < 0 
2 | A j | - 1 -0 < Aj < 0 

0 + | Aj| Otherwise 


where 


( 39 ) 


0 = min (xj , xmax -xj) 
xmax = 2 n - 1 


(40) 


A Further Benefit. Suppose the pre-processor in Fig. 4 is set to receive an 
external prediction, and fix that prediction to 

xj = 0 (41) 


Then, tracing through (39) and (40), one finds that 

Si = xj (42) 

That is, the input to the pre-processor, X, is passed directly through unchanged, 
becoming as 8. This means that the separate desired external input line in Fig. 1 (to 
allow the pre-processor to be skipped) can be omitted. 

The Ideal Case 

If one assumes that the distribution of A samples in (38) fits the Laplacian form, 
then the code equivalence result in (16) can be proven. 17 - 18 That is, 

the simple PSI 1 ,k codes are equivalent 
to Huffman Codes for Laplacian 
distributions of prediction errors. (43) 


A-18 



Reference Sample 


Most of those applications which make use of the built-in Predictive Pre- 
processor will occasionally need to incorporate a Reference Sample along with the 
coded prediction errors (e.g., at the start of an image line). Each of the VLSI 
implementations incorporate an optional feature to extract such a Reference Sample 
from an incoming data stream. The JPL and MRC approaches to formatting this 
Reference Sample are distinctly different. Consult Refs. 10, 11, 14 and 15. 
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V. PERFORMANCE COMPARISONS 


References 12 and 18 compare the performance of a complete PSIss+ coding 
module with the well-known Lempel-Ziv, Adaptive Huffman and Arithmetic coding 
algorithms. 
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APPENDIX B 


A Practical Concept for Improving the 
MSTI Downlink Data Rate by 2:1 to 3:1 




OVERVIEW/SUMMARY 


Thanks to Ray Rew of Spectrum Astro, new light has been shed on the characteristics and 
elements of the now and future MSTI downlink. 111 Based on initial discussions with Ray, it 
would appear very likely that there is a technically and fina ncially p ractical approach to 
upgrading the nost-MSTI 3 downlink data rate capabilities by a factor of 2 to .3. 

Basically, the approach would require minor modifications to the current modulation 
system and the incorporation Reed-Solomon coding and decoding chips within the communication 
system. 

The inclusion of Reed-Solomon (RS) coding within a BMDO program would itself provide 
a second demonstration of modern communications technology (in addition to data compression). 
It would offer substantial long-term benefits to future programs in addition to the immediate 2- 
3:1 downlink improvement. (4:1 to 6:1 overall when combined with image compression.) 

This brief report seeks to provide: 

a) Historical perspective of RS coding; 

b) Performance characteristics associated with the MSTI data link; 

c) Technology availability. 

In the interest of securing the potential advantages of RS coding at the earliest possible 
date (e.g., MSTI 4 application) it is important that the information provided here be considered 
and acted on as soon as possible. 

THE UNCODED DOWNLINK 

Error Vulnerability 

The effect of individual communication bit errors on uncompressed data is generally 
isolated to the data sample in which they reside. But the effect on compressed data can be far 
more devastating. Basically, a single error could cause a decompressor to become confused until 
it can be re-initialized (e.g., possibly on a line-by-line basis for the MSTI 3 image compression 

demonstration). 

A similar error susceptibility is associated with the encryption techniques which are 
currently planned. In both cases, single errors tend to cause "bursts" of errors (this will be 
looked at more closely later). 

One can avoid the consequences of this error vulnerability by assuring that the frequency 
of errors is reduced to an acceptably low value. At this time, acceptably low has come to mean 
a bit-error probability for MSTI of 111 

P b < 10 6 0) 
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The Uncoded Channel 


The MSTI 2 communication system (and presumably MST1 3 and on) employ an 
’’uncoded" binary data link 1 ' 1 in the sense that it includes no "error-correcting" coding. In the 
absence of bandwidth constraints, which we will assume for now, the MSTI uncoded data link 
looks exactly like a similarly named uncoded, deep-space data link in terms of performance 
characteristics. Channel error characteristics are shown in Fig. 7, which plots "bit-error" 
probability, P b , versus signal-to-noise ratio, E b /N c (energy-per-bit) in decibels (dB). 

E b /N 0 vs. Rate. Assuming that transmission power, receiver/transmitter antennas (and 
their proximity) remain fixed, E,/N 0 can be viewed as the inverse of data rate. Increasing the 
data rate by a factor of two (3 dB) doubles the Eb/N 0 , and vice versa. (The energy available per 
unit of time applies to a bit’s signal twice as long.) 



E b /N„, dB > 

< Data Rate 


Fig. 1. The Uncoded Performance Curve 
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Point A in Fig. 1 indicates the point on this curve (Et/N,, = 10.8 dB, rate = R bits/s) 
where the frequency of "random" bit errors is 10 6 . This is currently the worst case operating 
point for MSTI as specified in (1). That is, there will usually be a greater E,/N 0 than required 
to achieve a P b = 10' 6 . 1 " This is called Margin, and assures that throughout a mission the 
conditions in (1) are met. 

Point B on the Uncoded Link Performance curve indicates where P b = 10 4 . This occurs 
roughly 3 dB lower where [E b /N 0 » 7.8 dB, rate - 2R bits/s] . Thus 

a) If the symbol (bit) rate over the channel could be increased (the 
bandwidth question) 

and 

b) If it would be possible to accept an error rate as high as P b = 10 4 

then, an increase in data rate by a factor of two could be achieved (with the same margin). 

Bandwidth. In the discussions with Ray Rew m in March, he indicated that 
some minor modifications to the current modulation system should allow an (2) 

increase in channel symbol rate by 3-4 dB . (Of course, Ray would need to more clearly 
indicate what "minor" means.) 

This would seem to take care of item (a) above (although we will later be interested in 
transmitting channel symbols at rates higher than 3R). However, the error vulnerability of both 
compressed and encrypted data makes condition (b) unacceptable. 


CODING THE CHANNEL 

Similar situations existed in the deep-space program in the 1970’s. Before the first 
inclusion of image data compression on Voyager at Uranus, the accepted error rate for 
uncompressed image data was in the vicinity of 10' 3 . To include error-sensitive data compression 
with the communication channels of the time would have required a reduction in data rate by 
approximately a factor of two. This is basically the situation we have here with MSTI, except 
that we’ve already reduced the data rate (to get to P b = 10' 6 ). 

The deep-space situation was somewhat more complicated than the present MSTI 
problem. The baseline communication system for Voyager (and other projects at the time) 
employed a convolutionally encoded/Viterbi decoded channel which would operate at bit error 
rates as high as 10‘ 3 . This was acceptable for pre-data-compression imaging data but not for 
other forms of data (which generally constituted a small fraction of the total data). Hence, the 
non-imaging data was additionally coded with a (24,12) Golay block code (before reaching the 
convolutional coder). 
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AICS 12 ’-' 41 


The Advanced Imaging Communication System (AICS) was introduced as a solution in 
the mid 1970’s. 

AICS employed a powerful, interleaved Reed-Solomon code in combination with the 
convolutional channel. All data , compressed or not, could be communicated through this 
concatenated channel "virtually error free” at roughly the same data rate at which the 
convolutional channel alone achieved a bit-error probability of only 10 3 . 

AICS was patented by NASA in 1976 and the concatenated RS/Viterbi channel has 
become a NASA and International Standard. It appears on virtually all deep space missions, 
starting with Voyager (at Uranus). Various developments have produced both RS coder and 
decoder chips with essentially the same code parameters as originally specified in Ref. 2. 

On the surface, these developments would seem to have little to do with the MSTI 
problem, since MSTI currently employs an uncoded link. On the contrary, we just need to look 
at the problem a little differently. 

References 2-4 provide the original arguments and technical details that were originally 
needed to convince doubters of AICS performance and feasibility (data compression had not yet 
been used either). Subsequent sections will extract portions of these references, and others, to 
address the MSTI application. 

The Reed-Solomon Code 

A simple block diagram of a Reed-Solomon coder with parameters J and E is shown in 

Fig. 2. 


INPUT 



Fig. 2. RS Coder 


B-4 




Such an RS code is non-binary. An RS symbol consists of J bits so there are 2 J 
possible RS symbols. All coding and decoding operations involve RS symbols, not individual 

bits. 

The input to an RS Coder are called "information symbols." As Fig. 2 shows, 2 J -(1+2E) 
information symbols enter the RS Coder to the left. The result of coding is a codeword of length 
2'-l symbols of which the first 2 J -(1+2E) are the same symbols entering on the left. This makes 
the code systematic . The remainder of the codeword is filled in with 2E parity (check) symbols . 

An RS symbol is in error if any of the J bits making up the symbol are in error. E 
represents the number of correctable RS symbol errors in an RS codeword. That is, if E or less 
RS symbols are in error in any way, the coder will be capable of correcting them. 

Tying Down Parameters. For a great many reasons the code parameters chosen 
originally for AICS and eventually implemented in VLSI are 

J = 8 , E = 16 (3) 

Thus RS symbols are bytes and a decoder can correct up to 16 "symbol" errors in a single 
codeword. The RS code structure with these parameters is shown in Fig. 3. 



Fig. 3. Basic RS Codeword Structure, J = 8 , E = 16 
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Interleaving. Errors made by a Viterbi decoded convolutional channel tend to be 
"bursty". That is, errors occur together in clumps (increasingly so at higher error rates). The 
symbol error correcting capabilities of this RS code is ideally suited and matched to these 
characteristics. However, such "burstiness" becomes increasingly severe at high (convolutional 
channel error rates). To avoid a loss in performance, deep space RS coding systems employ 
"interleaving" to spread the effect of long inner-channel error bursts over several codewords. 

As understood now, a near-term MSTI application of RS coding is unlikely to require the 
deep-space interleaving capability because an uncoded data link does not inherently have any of 
these "bursty" characteristics. The currently available RS coder/decoder chips can set interleave 
depth, I, from I = 1 (none), up to I = 8. 

While not immediately an issue, there are certainly many military (SDI) situations in 
which error events come in bursts, even when the underlying channel is uncoded (e.g., other 
components of the end-to-end data link). Thus, for future reference, it is important to at least 
understand the potential burst-error-correcting capability of an interleave RS code. 

The basic burst-error-correcting capability of this J = 8, E = 16 code is a burst of E = 16 
symbols or 128 bits. That is, a complete wipe out of any stretch of 128 bits in a codeword 
(without other errors) could be corrected. 

Interleaving at a depth I basically increases the burst error correcting ability by the factor 
1. Then with I = 8. an error burst of over 1000 bits could be corrected (assuming no other errors 
in the 8 codewords). 

RS Coding on Uncoded Channel 

The performance curve for the uncoded data link of Fig. 1 appears again in Fig. 4 along 
with the performance curve for an RS coder with the just specified parameters J = 8, E = 16, 

I > l. |5) 


As in Fig. 1, error probability is plotted as a function of Ei/N,, in Db, which we know is 
inversely related to data rate. Points A and B are repeated and their relative data rates are 
indicated as R bits/s and 2R bits/s, respectively. 

An RS performance curve is shown as the new graph. Here the ordinate is "probability 
of a Reed-Solomon codeword error, Prs , instead of bit-error probability (which would be 
lower). This is a much more appropriate measure of performance as we will illustrate later. 

Note that "signal-to-noise” ratio needs a more sophisticated interpretation here. E,/N 0 
means energy per information bit. On an uncoded data link the information bits are the same 
as channel bits so this issue never comes up. But with RS coding, not all the bits communicated 
are information bits. Roughly 12% are parity symbols with the J = 8, E = 16 code. 

The RS performance curve shown takes this into account and reflects the true information 
rate (the data rate of the real data). Channel bits are actually communicated at about 0.6 dB 
lower signal-to-noise ratio (and 0.6 dB higher bit-rate). 
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Fig. 4. RS Performance Curve 


As should be apparent, the P RS performance curve is quite different than that for the 
uncoded Data Link. Not only is it located at a lower EyN,, (and hence higher data rate), but 
it is very steep. Error rates drop off by approximately one order of magnitude for every 0.1 dB 
increase in E^/N,, . Basically, once the E,/N 0 exceeds around 6.2 dB (point C), where ^ 

Prs < 10 6 , a "virtually error-free" data link is achieved. The true meaning of a P RS < 10 
is discussed in a later section titled "Error Events. 

Data Rate Improvement. The difference in Et/N c , and hence data rate, between Point 
A and Point C in Fig. 4 provides the potential data rate improvement to post-MSTI 3 missions 
from a direct application of RS coding. The potential data rate improvement is 

A Rate = 4.6 dB (4) 

which is equivalent to a factor of 3 . This is illustrated in Fig. 4 where Point A is indicated to 
be at data rate, R, and Point C at data rate 3R. 

Actual improvements will depend on what modifications to the current modulation system 
are feasible. Such modifications need to allow a higher (channel) symbol rate without bandwidth 
constraints. Initial conversations with Ray Rew suggest that a factor of two or so should be quite 
feasible. 1 ' 1 
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Concatenated RS Coding 


It is of longer-term SDI interest to understand the potential for data rate improvements 
based solely on the available Et/N,, . That is, when modulation systems can accommodate the 
higher bandwidth. Unlike the direct RS coding application, the following discussion is not 
expected to be of immediate interest to the MSTI program. 

The same RS code, concatenated with a K = 7, v = 2 convolutional code (Voyager) could 
operate at approximately 3.2 dB lyN,, to achieve the equivalent of Point C in Fig. 4 That’s 
another factor of two - for a total data rate improvement of 6 . 

Adding Compression. Coupled with an assumption of a 2:1 lossless compression 
capability, this yields a total net potential of 12:1 improvement in downlink capability from 
advanced technology - without any loss to the data. 

If it were later determined that frame-to-frame prediction could improve the lossless 
compression factor by up to 6: 1 for some images - then, on those images, the net gain through 
the downlink would approach 36:1 . 

Bandwidth Expansion. To operate the K = 7, v = 2 codes at such low Ej/N,, would 
require a channel symbol rate which is a factor of 4 higher than required of the uncoded data link 
to achieve Point C (i.e., 12:1+ compared to Point A). Some other less powerful codes (e.g., with 
v = 4/3) could get close to these results with a channel symbol rate increase closer to 3:1. 

Going the other way, a further data rate improvement of about 1 dB could be achieved 
with more powerful convolutional codes (e.g., K = 15, v = 4) and greater bandwidth expansion. 

Error Events and Prs 

An error event on the uncoded channel occurs whenever an individual bit error occurs. 
Such independently occurring random errors will tend to be distributed uniformly throughout a 
data stream. 

An error event on a channel which incorporates the RS coding occurs only when an RS 
codeword is in error (i.e., when more than 16 symbol errors occur). Since a codeword is roughly 
2000 bits in length, this means that, with a codeword error rate of P RS , an error event will occur 
once every 


2xl0 3 


RS 


(5) 
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Taking P RS = 10' 6 as assumed for Point C in Fig. 4, this would mean one error event every 

2 x 10 3 x 10 6 = 2 x 10 9 bits (6) 

Accepting a decrease in data rate by a tiny 0.1 dB (i.e., at 3.3 dB) would increase this figure to 
one error event every 

20 Gbits (7) 

or close to one error event every 

3 Gbytes (8) 


Surely, the term "virtually error free” is appropriate for this kind of performance, and 
suggests the possibility of an error-free mission. 

Effect of Error Events. A decoded RS symbol stream will remain entirely error-free 
unless a codeword error occurs. When a codeword is in error, more than 16 symbol (byte) errors 
have occurred and so they are not correctable by the 16 error-correcting code decoder. (In most 
cases, there will be 17 symbol errors remaining). Such symbol errors will be randomly 
distributed throughout a codeword, including its non -information parity symbols. We wish to 
consider the worst case impact of such errors on compressed image data. 

Figure 5 provides a somewhat idealized illustration which should be adequate for this 
purpose. Here a sensor produces source image lines which are 512 pixels long, with each pixel 
quantized to 8 bits/pixel. Images are delineated by parentheses, ’(’ and ’)’ in this figure. The 
central three source lines in Fig. 5 are labelled LI, L2 and L3. 

A source line is assumed to be prefaced by a synchronization word (SYNC), of length £ 
bits. The total length of an uncompressed source line is then 4096 + e bits, where e « 4096. 

Now assume, somewhat idealistically, that the data part of each line is compressed by 2: 1 
(e.g., using a lossless Rice Algorithm system as currently envisioned) and that each compressed 
line is preceded by a synchronization word (SYNC) and a Reference Sample (REF), of combined 
length £’ . The length of a compressed line is then 2048 + e’ bits , where e’ « 2048 . 
Uncompressed lines LI, L2 and L3 in Fig. 5 become compressed lines cL 1? cLj and cL 3 . 

If an error is encountered by a decompressor it will very likely cause the effect of the 
error to propagate until the decompressor can be re-initialized. The SYNC and REF assumptions 
provide the assurance that, with very high probability, a decompressor can be reset at the start 
of each line. Thus, the impact of "an error" can be confined to a line. Then a worst case 
assumption for the impact of a single error is that it will cause the complete loss of the line it 
occurs in (i.e., as if the error occurred at the start of the line). 

Under this assumption of worst case damage from a single error, multiple errors occurring 
within a line are no more damaging than one. This bounds the problem in a practical way. 
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Uncompressed Lines 



Compressed Lines 


204B + e' BITS 



]784 1784 Informa+ion Bits !734 

of RS codeword 


Fig. 5. Error Effects on Compressed Lines 
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Now consider Fig. 5 again. The lower part of the diagram assumes that the compressed 
lines are passed through a J = 8, E = 16 RS coder without interleaving (I = 1). Generally, the 
stan of a line and the stan of an RS codeword will not line up as Fig. 5 illustrates. Here the 
square brackets, T and ’]’ , are used to show the stan and end boundaries of the 1784 
Information Bits of sequential RS codewords. 

Suppose that the codeword marked A in Fig. 5 is in error. Extending our worst case 
criteria, compressed lines with any portions that lie within this RS codeword are lost. In this 
example, lines LI and L2 would be lost (although we can easily see that only the end of LI 
would be damaged). 

Summary Observations. Combining these observations with those in (8) we have, for 
a system incorporating basic lossless image compression: 

Operating the RS coded link at an Et/N,, > 3.3 dB 
(corresponding to an improvement factor in data rate 

of = 3:1 over the uncoded data link), would incur a worst (9) 

case loss of only two lines of image data once every 6 Gbytes 
of source data! (That is, once per mission?) 

By comparison, 

An uncoded data link operating at a bit-error probability of 

P b = 10 6 (i.e., at an E b /N 0 = 10.8 dB, or a data rate only 1/3 

of the proposed RS coded system), would incur a worst case (10) 

expected loss of 24.000 lines of image data for the same 

6 Gbytes source data. 


Effect of Error Events II. Some additional observations can be made. 


Interleaving 


Interleaving might be incorporated to extend the burst-error correcting ability of the RS 
coding and/or to avoid (E,/N 0 ) performance degradation when operating in combination with a 
convolutional inner channel. 

The built-in interleaving approach of available chips is called Interleave B in Ref. 2. In 
this case, increasing interleave depth, I , will spread the errors of a single codeword over 
additional compressed lines. Specifically, errors may occur over a stretch of 17841 bits, instead 
of only 1784 bits. 

Performing the interleave A approach in Ref. 2 would assure that the effect of a single 
codeword error remained confined to only 1784 data bits regardless of interleave depth, I . 


B-11 



More Sophisticated Compression 


If the lossless compression system now being investigated, and illustrated in Fig. 5, were 
extended to include two-dimensional prediction (to improve performance by = 0.5 bits/pixel), 
individual lines would no longer be independent. The effect of an error could propagate over 
multiple lines (instead of only to the end of one). This problem would be further compounded 
if prediction between image frames were employed. Errors could, in this case, impact multiple 
frames . Similar situations can be associated with the application of "lossy" algorithms too.* 

Certainly, the advantages of the virtually error-free environment provided by Reed- 
Solomon coding to these situations is obvious. 

TECHNOLOGY AVAILABILITY 

Reed-Solomon coder and decoder chips with the desired J = 8 , E = 16 parameters have 
been developed by the University of New Mexico under the direction of GSFC.' 61 The CMOS 
coder chips are space qualified. 

• Radiation-hardened to 3 x 10 6 rads total dose 

• SEU-hardened to less than 1 x 10 10 error per bit-day 

• Pin selectable interleaving, 1=1-8 

• 200 Mbits/s capability 

COST™ 


Decoders. $2K per 4-chip decoder. 

Flight Coders. Minimum order of 25 chips at $1200 per unit, plus $7K testing per lot, 
for a total of » $37K. 


*Current de-encryption algorithms have a similar difficulty with errors as compressed data. 
Many errors are generated from a single channel error. 
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